feat: allow cloud deletes for all object types (with name-based safety protections)#877
Conversation
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This PyAirbyte VersionYou can test this version of PyAirbyte using the following: # Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@aj/feat/mcp/allow-cloud-deletes-on-all-objects' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@aj/feat/mcp/allow-cloud-deletes-on-all-objects'Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
Community SupportQuestions? Join the #pyairbyte channel in our Slack workspace. |
There was a problem hiding this comment.
Pull request overview
This PR enhances deletion safety for Airbyte Cloud resources by:
- Removing the "delete:" prefix option for safe-mode deletions, now requiring only "delete-me" or "deleteme" in resource names
- Adding name verification parameters to deletion operations to prevent accidental deletions
- Introducing new deletion functions for sources, destinations, and connections with built-in safety checks
- Removing try-except error handling blocks from deployment and operational functions
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
tests/integration_tests/cloud/test_custom_definitions.py |
Updated test to reflect removal of "delete:" prefix option and verify new "deleteme" alternative |
airbyte/mcp/cloud_ops.py |
Removed exception handling from deployment functions, added three new deletion functions with name-based safety protections, and fixed sync status return logic |
airbyte/mcp/_tool_utils.py |
Added documentation clarifying safe mode behavior |
airbyte/cloud/workspaces.py |
Added safe_mode parameters to deletion methods with updated documentation |
airbyte/cloud/connectors.py |
Updated documentation to reflect new safe-mode naming requirements |
airbyte/_util/api_util.py |
Implemented _is_safe_name_to_delete helper function, added safe_mode parameters to deletion functions, and added null-safety check for destination configuration |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughAdds a safe_mode safety layer to deletion APIs requiring resource names to contain "delete-me" or "deleteme" (case-insensitive), propagates safe_mode and optional name args through workspace and MCP cloud deletion workflows, and refactors MCP cloud ops to remove broad try/except blocks and add explicit name verification on deletes. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor CLI/Caller
participant API as api_util.delete_*(id, name?, safe_mode)
participant Validator as _is_safe_name_to_delete
participant Backend as Airbyte Cloud API
CLI/Caller->>API: request delete(id, safe_mode=True)
rect rgb(247,250,245)
Note over API,Validator: Safe-mode validation (name required or fetched)
API->>API: if name not provided -> fetch resource name
API->>Validator: validate name contains "delete-me" or "deleteme" (ci)
Validator-->>API: valid / invalid
end
alt valid
API->>Backend: DELETE /v1/.../{id}
Backend-->>API: 200 OK
API-->>CLI/Caller: success message (includes resource name)
else invalid
API-->>CLI/Caller: raise PyAirbyteInputError (includes name/context)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Would you like me to flag specific changed lines in Possibly related PRs
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used🧬 Code graph analysis (1)tests/integration_tests/cloud/test_cloud_workspaces.py (5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
🔇 Additional comments (2)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
airbyte/_util/api_util.py (1)
994-1003: Consider simplifying the boolean expression, wdyt?Using
any({...})with a set is a bit unconventional. A simpleorexpression would be more idiomatic and avoids the set construction overhead:def _is_safe_name_to_delete(name: str) -> bool: """Check if a name is safe to delete. Requires the name to contain either "delete-me" or "deleteme" (case insensitive). """ name_lower = name.lower() - return any({ - "delete-me" in name_lower, - "deleteme" in name_lower, - }) + return "delete-me" in name_lower or "deleteme" in name_lowerairbyte/mcp/cloud_ops.py (3)
907-911: Comment mentions outdated naming pattern - mind updating it?The inline comment references
"delete:"which is not a valid safe-mode pattern anymore. Only"delete-me"or"deleteme"are accepted, wdyt?# Safe mode is hard-coded to True for extra protection when running in LLM agents workspace.permanently_delete_source( source=source_id, - safe_mode=True, # Requires name-based delete disposition ("delete:" or "delete-me") + safe_mode=True, # Requires name to contain "delete-me" or "deleteme" )
966-970: Same stale comment here too!# Safe mode is hard-coded to True for extra protection when running in LLM agents workspace.permanently_delete_destination( destination=destination_id, - safe_mode=True, # Requires name-based delete disposition ("delete:" or "delete-me") + safe_mode=True, # Requires name to contain "delete-me" or "deleteme" )
1044-1050: And here as well - third occurrence of the stale comment.# Safe mode is hard-coded to True for extra protection when running in LLM agents workspace.permanently_delete_connection( - safe_mode=True, # Requires name-based delete disposition ("delete:" or "delete-me") + safe_mode=True, # Requires name to contain "delete-me" or "deleteme" connection=connection_id, cascade_delete_source=cascade_delete_source, cascade_delete_destination=cascade_delete_destination, )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
airbyte/_util/api_util.py(7 hunks)airbyte/cloud/connectors.py(1 hunks)airbyte/cloud/workspaces.py(5 hunks)airbyte/mcp/_tool_utils.py(1 hunks)airbyte/mcp/cloud_ops.py(13 hunks)tests/integration_tests/cloud/test_custom_definitions.py(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-06T23:44:31.534Z
Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Applied to files:
airbyte/mcp/cloud_ops.py
🧬 Code graph analysis (1)
airbyte/_util/api_util.py (1)
airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)
🔇 Additional comments (13)
airbyte/mcp/_tool_utils.py (1)
51-52: LGTM!The updated docstring accurately reflects the existing behavior - good addition to clarify when this check becomes a no-op.
airbyte/cloud/connectors.py (1)
503-504: LGTM!Docstring accurately reflects the updated safe_mode naming requirements - consistent with the centralized
_is_safe_name_to_deletehelper.airbyte/_util/api_util.py (5)
538-609: LGTM!The safe_mode implementation is well-designed:
- Lazy name fetching (only when needed)
- Clear, actionable error messages
- Appropriate context for debugging
747-761: LGTM!Good defensive change - handling the optional
raw_configurationprevents potential errors when the API response lacks a configuration field.
770-842: LGTM!Consistent implementation with
delete_source- same safe_mode pattern, error handling, and messaging.
1006-1078: LGTM!Consistent safe_mode implementation across all three deletion functions. The pattern is clear and maintainable.
1468-1500: LGTM!Properly uses the centralized
_is_safe_name_to_deletehelper and updated error messages align with the new naming requirements.tests/integration_tests/cloud/test_custom_definitions.py (1)
166-166: LGTM!The assertion correctly validates that the error message mentions the valid safe-mode naming patterns.
airbyte/cloud/workspaces.py (3)
250-278: LGTM!Good optimization - passing
source_namewhen available avoids an extra API call in the deletion path when safe_mode is enabled.
282-314: LGTM!Consistent pattern with
permanently_delete_source- properly passesdestination_nameto avoid redundant fetches.
368-415: LGTM!Nice implementation -
safe_modeis properly propagated to cascade deletions, ensuring consistent behavior for the entire deletion tree.airbyte/mcp/cloud_ops.py (2)
11-11: LGTM!Appropriate import for input validation in the MCP layer.
833-853: Nice defense-in-depth approach!The MCP layer adds name verification on top of the API's safe_mode check - two independent safety mechanisms that both must pass. This is appropriate protection when LLM agents are making destructive calls.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…h safety protections Co-authored-by: aaronsteers <18150651+aaronsteers@users.noreply.github.com>
Co-authored-by: aaronsteers <18150651+aaronsteers@users.noreply.github.com>
…lve conflicts Co-Authored-By: AJ Steers <aj@airbyte.io>
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
airbyte/mcp/cloud_ops.py (1)
1040-1091: Sameworkspace_idomission as source deletion.This function follows the same pattern as
permanently_delete_cloud_sourceand also omits theworkspace_idparameter. If you decide to add it to the source function, consider adding it here too for consistency.
🧹 Nitpick comments (2)
airbyte/mcp/cloud_ops.py (2)
981-1032: Missingworkspace_idparameter - is this intentional?I noticed this function doesn't accept a
workspace_idparameter, unlikepermanently_delete_custom_source_definition(line 924-930) and other similar functions likerename_cloud_source. This means it always uses the default workspace from environment variables.Was this a deliberate simplification, or should it include the optional
workspace_idparameter for consistency with the rest of the API, wdyt?def permanently_delete_cloud_source( source_id: Annotated[ str, Field(description="The ID of the deployed source to delete."), ], name: Annotated[ str, Field(description="The expected name of the source (for verification)."), ], + *, + workspace_id: Annotated[ + str | None, + Field( + description="Workspace ID. Defaults to AIRBYTE_CLOUD_WORKSPACE_ID env var.", + default=None, + ), + ] = None, ) -> str:
1099-1171: Cascade delete options look good, sameworkspace_idnote applies.The cascade delete options (
cascade_delete_source,cascade_delete_destination) are well-structured with clear Field descriptions. Thesafe_mode=Trueis correctly passed to the underlying call.Same note about missing
workspace_idparameter applies here. The consistency across all three new delete functions suggests this might be intentional though?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte/mcp/_tool_utils.py(1 hunks)airbyte/mcp/cloud_ops.py(14 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte/mcp/_tool_utils.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-06T23:44:31.534Z
Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.
Applied to files:
airbyte/mcp/cloud_ops.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (No Creds)
🔇 Additional comments (5)
airbyte/mcp/cloud_ops.py (5)
131-153: Clean refactoring of deploy operations.The removal of broad try/except blocks in favor of direct execution paths is a good improvement. Errors will now propagate with meaningful stack traces instead of being potentially swallowed or repackaged.
462-495: Solid null handling and structured response.The explicit check for missing
sync_resultwith a clear fallback response is a nice improvement. The structured dictionary response is clean and well-organized.
640-673: Thorough edge-case handling for log retrieval.All the null/empty cases are handled with clear, descriptive messages. The attempt lookup logic correctly iterates to find the specified attempt or falls back to the latest one.
953-973: Good safety pattern with name verification.The name mismatch check with detailed context in the exception is a solid safety mechanism. The strict case-sensitive comparison is appropriate for preventing accidental deletions.
1010-1010: Cast assumes name is always present.The
cast(str, source.name)assumes the API always returns a non-null name for deployed sources. This is probably safe in practice, but if there's ever a case wherenameisNone, the subsequent comparison would fail unexpectedly.Is this assumption documented or guaranteed by the API contract? If not, you might consider a defensive check, wdyt?
actual_name: str = source.name or "" if not actual_name: raise PyAirbyteInputError( message="Source has no name, cannot verify for deletion.", context={"source_id": source_id}, )
Co-Authored-By: AJ Steers <aj@airbyte.io>
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
airbyte/_util/api_util.py (3)
747-757: Consider tightening the type check onraw_configurationbefore splatting
raw_configurationis now optional, which is great, but we still assume it’s a mapping when we do**raw_configuration. Would it be worth guarding with anisinstanceto avoid a surprisingTypeErrorif the API ever returns a non-dictconfiguration(e.g.,Noneor a list) for a known destination type, wdyt?- raw_configuration: dict[str, Any] | None = raw_response.get("configuration") + raw_configuration: dict[str, Any] | None = raw_response.get("configuration") @@ - if destination_type in destination_mapping and raw_configuration is not None: - response.destination_response.configuration = destination_mapping[ - destination_type # pyrefly: ignore[index-error] - ](**raw_configuration) + if ( + destination_type in destination_mapping + and isinstance(raw_configuration, dict) + ): + response.destination_response.configuration = destination_mapping[ + destination_type # pyrefly: ignore[index-error] + ](**raw_configuration)
994-1005: Simplify_is_safe_name_to_deletefor readabilityThe helper behaves correctly, but the
any({ ... })with a set literal is a bit unusual for a simple OR check. Would simplifying this to a direct boolean expression help future readers scan it faster, wdyt?def _is_safe_name_to_delete(name: str) -> bool: @@ - name_lower = name.lower() - return any( - { - "delete-me" in name_lower, - "deleteme" in name_lower, - } - ) + name_lower = name.lower() + return "delete-me" in name_lower or "deleteme" in name_lower
1470-1495: Minor readability tweak and potential DRY opportunity in custom YAML deleteThe safe_mode behavior for custom YAML source definitions matches the other delete paths, which is great. Two tiny thoughts:
- Since you already assign
connector_name = definition_info.name, would calling_is_safe_name_to_deleteonconnector_nameread a bit cleaner?- Given that the safe_mode blocks and error messages are now very similar across sources, destinations, connections, and custom definitions, would a small internal helper (e.g.,
_ensure_safe_delete(resource_type, name, id, safe_mode)) help reduce duplication and keep future message changes in one place, wdyt?- connector_name = definition_info.name - - if not _is_safe_name_to_delete(definition_info.name): + connector_name = definition_info.name + + if not _is_safe_name_to_delete(connector_name): raise PyAirbyteInputError( message=( f"Cannot delete custom connector definition '{connector_name}' " "with safe_mode enabled. " "To authorize deletion, the connector name must contain 'delete-me' or " "'deleteme' (case insensitive).\n\n" "Please rename the connector to meet this requirement "
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte/_util/api_util.py(7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/_util/api_util.py (3)
airbyte/sources/util.py (1)
get_source(47-139)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)airbyte/destinations/util.py (1)
get_destination(21-95)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (No Creds)
🔇 Additional comments (3)
airbyte/_util/api_util.py (3)
541-591: Safe-mode guard fordelete_sourcelooks consistent and clearThe safe_mode flow (optional
source_name, fetch-if-missing, centralized_is_safe_name_to_deletecheck, and a detailedPyAirbyteInputError) looks solid and matches the intent described in the docstring. I don’t see any functional issues here, and the context payload on the error should be helpful for callers, wdyt?
773-824: Destination safe-mode deletion behavior matches source semanticsThe
delete_destinationsafe_mode flow mirrorsdelete_sourcenicely (optionaldestination_name, fetch on demand, centralized helper, and a descriptivePyAirbyteInputError). The consistency across these two entry points should make behavior predictable for callers; I don’t see any issues here, wdyt?
1009-1061: Connection safe-mode deletion is consistent with sources/destinationsThe
delete_connectionsignature and implementation follow the same safe_mode pattern (including fetchingconnection_namewhen needed and raisingPyAirbyteInputErrorwith good context), which keeps the API surface uniform across object types. Functionally this looks good to me; I don’t see any additional edge cases you need to handle right now, wdyt?
PyTest Results (Full)388 tests - 1 372 ✅ - 1 23m 56s ⏱️ - 3m 3s Results for commit b4dd9dd. ± Comparison against base commit a8cf82c. This pull request removes 1 test.♻️ This comment has been updated with latest results. |
Co-Authored-By: AJ Steers <aj@airbyte.io>
Summary
This PR adds safe mode deletion capabilities for all cloud object types (sources, destinations, connections, and custom source definitions) with name-based safety protections. Key changes:
permanently_delete_cloud_source,permanently_delete_cloud_destination, andpermanently_delete_cloud_connectionendpointsUpdates since last revision
workspace_idparameter threading)airbyte/_util/api_util.pyReview & Testing Checklist for Human
cascade_delete_sourceandcascade_delete_destinationoptions work correctly with safe modeRecommended test plan:
Notes
_is_safe_name_to_deletehelper function is now shared across all deletion operationsRequested by: Aaron ("AJ") Steers (@aaronsteers) (AJ Steers)
Devin session: https://app.devin.ai/sessions/9323978571ee49719d66f4b5f4005fa7
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests
✏️ Tip: You can customize this high-level summary in your review settings.